Coding Standards for R and Python

Tom Wilson

4 October 2023

Contents

  • What are coding standards?
  • Why apply standards to your code?
  • Style guides typically used for R and Python.
  • Code formatting and linting tools.
  • Pre-commit hooks and continuous integration.
  • Creating documentation.

Some suggestions, not a definitive guide to the above!

What are coding standards?

Coding standards are guidelines and best practices for writing clean, consistent, maintainable code. They include:

  • Naming conventions for variables, functions, classes and files.
  • Formatting rules to make your code easier to read.
  • Documenting and commenting standards.
  • Best practices for writing efficient code and reducing errors.


Coding standards are often described in Style Guides.

Why apply a coding standard?

“Code is read much more often than it is written”
Guido van Rossum

  • It makes your code more readable.
  • Readable code is more easily understood.
  • Code that is easily understood is more likely to find reviewers and collaborators.
  • It makes your code look more professional.
  • It saves you inventing your own conventions.

Style Guides

  • R: Tidyverse Style Guide
  • Python: PEP 8

Tidyverse Style Guide

Some of its syntax suggestions:

  • Variable and function names should contain only lowercase letters, numbers and _
  • Always use <- not = for variable assignment
  • Put one space after commas and on either side of assignment (<-, =) and comparison operators (>, <, == etc.)
  • Keep lines to a maximum of 80 characters
  • Use " not ' for quoting text
  • When using function, if, for, while:
    • Opening { should be last character on a line
    • Then two space indent
    • Closing } should be first character on a line

A simple example:

# Ignoring Tidyverse Style Guide ----
myVector=c(10,20,30)

myCalc=function(x,Multiplier){x*Multiplier}

myCalc(myVector,Multiplier=10)


# Applying Tidyverse Style Guide ----
my_vector <- c(10, 20, 30)

my_calc <- function(x, multiplier) {
  x * multiplier
}

my_calc(my_vector, multiplier = 10)

PEP 8

Some of its requirements:

  • Lowercase variable and function names with words separated by underscores
  • Class names should start each word with a capital letter, with no separators (CapWords case)
  • Use four spaces for indentation (not tabs)
  • Spaces around operators and after commas, but not directly inside parentheses or brackets
  • A maximum line length of 79 characters (72 for comments or docstrings)
  • Functions should have a docstring that explains what the function does, its arguments, and its return value

A simple example:

# Ignoring PEP 8
# Multiplies input by a multiplier
def Multiply(inputVal,MultiplierVal=10):
 return inputVal*MultiplierVal


# Applying PEP 8
def multiply(input_val, multiplier_val=10):
    """Multiplies an input by a multiplier.

    Args:
        input_val: The input value.
        multiplier_val: The multiplier value.

    Returns:
        The product of the input value and the multiplier value.
    """

    return input_val * multiplier_val

There is a separate PEP for docstrings: see PEP 257.
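
The function example does not show the class naming rule, so here is a small sketch that combines CapWords for a class with snake_case for its methods and variables. The class PriceCalculator and its method scale_prices are made-up names for illustration only.

```python
class PriceCalculator:
    """Apply a fixed multiplier to prices (hypothetical example class)."""

    def __init__(self, multiplier_val=10):
        self.multiplier_val = multiplier_val

    def scale_prices(self, prices):
        """Return each price multiplied by the stored multiplier.

        Args:
            prices: A list of numbers.

        Returns:
            A new list with every price scaled.
        """
        return [price * self.multiplier_val for price in prices]


calculator = PriceCalculator(multiplier_val=2)
scaled = calculator.scale_prices([10, 20, 30])
print(scaled)  # prints [20, 40, 60]
```

Note the spaces after commas and around the * operator, but no spaces around = in the keyword argument or directly inside the brackets.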

Tools to help

  • Code formatters

    • reformat code to match a particular style.
    • mostly spacing, indentation, line length, not renaming variables!
  • Code linters

    • flag potential bugs, poor practice, style violations.
    • linters do not change the code, but inform of problems.

    Tool Type    R Packages         Python Packages
    Formatter    styler, formatR    autopep8, Black
    Linter       lintr              Flake8, pylint
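
To make the linter idea concrete, here is a toy sketch of a single lint rule written with Python's standard ast module. It reports function names that are not snake_case and leaves the source untouched. The names SNAKE_CASE and find_bad_function_names are assumptions made for this illustration; real linters such as Flake8 and pylint check far more than this.

```python
import ast
import re

# Hypothetical rule: lowercase letters, digits and underscores only.
SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")


def find_bad_function_names(source):
    """Return (line_number, name) pairs for functions not in snake_case."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, ast.FunctionDef)
        if is_function and not SNAKE_CASE.match(node.name):
            problems.append((node.lineno, node.name))
    return sorted(problems)


example_source = (
    "def Multiply(inputVal, MultiplierVal=10):\n"
    "    return inputVal * MultiplierVal\n"
    "\n"
    "def multiply(input_val, multiplier_val=10):\n"
    "    return input_val * multiplier_val\n"
)

# Report the problems; the source text itself is never modified.
for line_number, name in find_bad_function_names(example_source):
    print(f"line {line_number}: function name '{name}' is not snake_case")
```

A formatter would not fix this particular problem, because renaming a function changes the code's interface rather than its layout.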

styler and lintr

Once the packages are installed, use these tools from the RStudio Addins menu.

[Screenshots: RStudio Addins menu entries for styler and lintr]

In RStudio, any issues flagged by lintr are shown in the Markers tab:

[Screenshot: example output from running lintr on R code]

Unfortunately, RStudio does not have an on-save option for running formatters. If you use Git, see the Pre-commit Hooks section.

Black and pylint

After installing the packages with pip or conda, you can apply formatting or linting checks from a terminal:

[Screenshots: running black in a terminal; running pylint in a terminal]

However, an easier way to use these tools is within an IDE such as VS Code or PyCharm.

Formatting and linting in VS Code

  • VS Code has code formatter and linter extensions for Python.

  • Linting and formatting tools can be configured to run every time you save a file.

[Screenshots: the VS Code extensions for Black and Flake8]

  • VS Code also has a useful extension for automatically generating function docstrings.

[Screenshot: the VS Code extension for Docstring Creator]

Pre-commit Hooks

  • Pre-commit hooks are part of a Git workflow. They ensure code meets certain standards before it is committed.
  • Checks are specified in a .pre-commit-config.yaml file in the code repository.
  • These checks can include the formatting and linting tools described previously.
  • For Python, see pre-commit.
  • For R, it’s a bit more complicated to set up, because Python pre-commit must also be installed. The R package precommit has a useful setup article.

[Screenshot: an example .pre-commit-config.yaml file]
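
Conceptually, a Git pre-commit hook is a script that runs before a commit is created, and a non-zero exit status stops the commit. The sketch below illustrates only that idea: it is not how the pre-commit framework is implemented, and the function names and the trailing-whitespace rule are assumptions chosen for the example.

```python
def find_trailing_whitespace(files):
    """Find lines that end in spaces or tabs.

    Args:
        files: A dict mapping file names to their text content.

    Returns:
        A list of (filename, line_number) pairs.
    """
    problems = []
    for filename, text in files.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            if line != line.rstrip():
                problems.append((filename, line_number))
    return problems


def run_hook(files):
    """Return 0 if every check passes, 1 otherwise (like an exit status)."""
    problems = find_trailing_whitespace(files)
    for filename, line_number in problems:
        print(f"{filename}:{line_number}: trailing whitespace")
    return 1 if problems else 0


# Stand-in for the files staged for commit (hypothetical content).
staged_files = {
    "clean.py": "x = 1\n",
    "messy.py": "y = 2   \nz = 3\n",
}
exit_status = run_hook(staged_files)
print("commit blocked" if exit_status else "commit allowed")
```

In practice you would list existing hooks such as formatters and linters in the config file rather than write checks like this yourself.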

Continuous Integration (CI)

  • Automated builds and tests are run when code changes are pushed to a central repository.
  • GitHub Actions might be used to run a specific formatter or linter as part of a CI pipeline.

  • GitHub Actions can be configured to run when opening a GitHub Pull Request.
  • The repository can be configured so that a branch cannot be merged unless these checks pass.
  • This is an advanced topic. Other CI tools are available, such as Travis CI and Jenkins.

Creating Documentation

Both R and Python have specific ways to document functions (and classes in Python).

  • Python docstrings are wrapped in triple double quotes ("""). See PEP 257.
  • A docstring goes directly under the def function_name(...): line.

For example:

def multiply(input_val, multiplier_val=10):
    """Multiplies an input by a multiplier.

    Args:
        input_val: The input value.
        multiplier_val: The multiplier value.

    Returns:
        The product of the input value and the multiplier value.
    """
    return input_val * multiplier_val

  • In Python, you can view a function's documentation with print(my_function_name.__doc__)
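
As a small sketch of reading a docstring back at runtime, the example below reuses the multiply function from this section. It also uses inspect.getdoc from the standard library, which is an addition not mentioned above; it returns the same text with the indentation cleaned up.

```python
import inspect


def multiply(input_val, multiplier_val=10):
    """Multiplies an input by a multiplier.

    Args:
        input_val: The input value.
        multiplier_val: The multiplier value.

    Returns:
        The product of the input value and the multiplier value.
    """
    return input_val * multiplier_val


# The docstring is stored on the function object itself.
print(multiply.__doc__)

# inspect.getdoc returns the docstring with consistent indentation.
summary_line = inspect.getdoc(multiply).splitlines()[0]
print(summary_line)
```

In an interactive session, help(multiply) shows the function signature together with the same docstring.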

Function Documentation in R

  • For R functions, add roxygen comments to document them.
  • In RStudio, insert a roxygen template via the Code menu (Code > Insert Roxygen Skeleton).

#' My calculation
#'
#' @param x numeric vector or number to be multiplied.
#' @param multiplier the scalar value to multiply x by.
#'
#' @return scaled value
#' @export
#'
#' @examples
#' my_calc(c(1, 2, 3), 10)
my_calc <- function(x, multiplier) {
  x * multiplier
}

  • When developing an R package, roxygen2::roxygenise() is used to generate the documentation .Rd files from the roxygen comments.
  • See the documentation section of the Tidyverse Style Guide and the R Packages online book.

Summary